Analysis of the algorithm: From kernels to backup genes.

Kernelization section

The algorithm transformed each semantic similarity matrix into a kernel-compatible matrix. Once this was done for every network and kernel type, the resulting kernels were integrated across networks, separately for each kernel type. The tables below summarize the properties of the matrices at each phase of the process.

Annotation properties

Table 1. Annotation file descriptors

Net                 Min  Max    Average             Standard_Deviation
biological_process  1    134    6.999882332176266   11.432654770656663
cellular_component  1    40     4.162222345933308   5.25157343549579
disease             1    21     2.2250479846449136  2.909050012799259
interaction         1.0  729.0  29.8386134923593    54.03869234947398
molecular_function  1    26     3.0287856936832998  3.7159142158779024
pathway             1.0  191.0  4.003825833485152   8.704590940604282
phenotype           1    335    31.553476462477843  46.99329427183839
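
The descriptors in Table 1 could be computed along the following lines. This is only a minimal sketch: it assumes that each descriptor summarizes the number of annotation terms per gene and that the standard deviation is the population one, neither of which is stated in this report. The function name `annotation_descriptors` and the toy gene and term identifiers are hypothetical.

```python
from statistics import mean, pstdev

def annotation_descriptors(annotations):
    """Summarize the number of annotation terms per gene.

    `annotations` maps a gene identifier to the list of terms annotated to it
    (assumed layout, not taken from the report).
    """
    counts = [len(terms) for terms in annotations.values()]
    return {
        "Min": min(counts),
        "Max": max(counts),
        "Average": mean(counts),
        # Population standard deviation is an assumption; the report does not
        # say whether the population or the sample formula was used.
        "Standard_Deviation": pstdev(counts),
    }

# Toy input with hypothetical identifiers.
toy_annotations = {
    "gene_1": ["term_a"],
    "gene_2": ["term_a", "term_b", "term_c"],
    "gene_3": ["term_a", "term_b"],
}
toy_descriptors = annotation_descriptors(toy_annotations)
```

If the sample standard deviation were intended instead, `statistics.stdev` would replace `pstdev`.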

Matrix properties

Table 2. Similarity matrices

Net                     Matrix_Dimensions  Matrix_Elements  Matrix_Elements_Non_Zero
biological_process_sim  16997x16997        288898009        262329914
cellular_component_sim  17963x17963        322669369        322651406
disease_sim             4168x4168          17372224         16584998
interaction_sim         16098x16098        259145604        479348
molecular_function_sim  17335x17335        300502225        300484890
pathway_sim             3828x3828          14653584         159182
phenotype_sim           5077x5077          25775929         25770852
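
The three matrix descriptors used in these tables are simple counts, and dividing the non-zero count by the total number of elements gives the density of each matrix. In Table 2, interaction_sim and pathway_sim are very sparse (fewer than 2% of their entries are non-zero), while the other similarity matrices have more than 90% non-zero entries. The sketch below shows one way to obtain these figures; the function names `matrix_descriptors` and `density` are hypothetical, and the matrix is assumed to be a plain list of rows rather than whatever storage the pipeline actually uses.

```python
def matrix_descriptors(matrix):
    """Return dimensions, element count and non-zero count of a matrix."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    non_zero = sum(1 for row in matrix for value in row if value != 0)
    return {
        "Matrix_Dimensions": f"{rows}x{cols}",
        "Matrix_Elements": rows * cols,
        "Matrix_Elements_Non_Zero": non_zero,
    }

def density(elements, non_zero):
    """Fraction of entries that are non-zero."""
    return non_zero / elements

# Toy similarity matrix with an empty diagonal.
toy_sim = [
    [0.0, 0.8, 0.0],
    [0.8, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]
toy_descriptors = matrix_descriptors(toy_sim)

# Densities computed from the Table 2 values.
interaction_density = density(259145604, 479348)
pathway_density = density(14653584, 159182)
biological_process_density = density(288898009, 262329914)
```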

Table 3. Filtered similarity matrices

Table 4. Uncombined kernel matrices

Net                 Kernel  Matrix_Dimensions  Matrix_Elements  Matrix_Elements_Non_Zero
biological_process  ct      16997x16997        288898009        288898009
biological_process  el      16997x16997        288898009        288898009
biological_process  ka      16997x16997        288898009        262346911
biological_process  rf      16997x16997        288898009        288898009
cellular_component  ct      17963x17963        322669369        322669369
cellular_component  el      17963x17963        322669369        322669369
cellular_component  ka      17963x17963        322669369        322669369
cellular_component  rf      17963x17963        322669369        322669369
disease             ct      4168x4168          17372224         17372224
disease             el      4168x4168          17372224         17363890
disease             ka      4168x4168          17372224         16589166
disease             rf      4168x4168          17372224         17363890
interaction         ct      16098x16098        259145604        259081193
interaction         el      16098x16098        259145604        252047984
interaction         ka      16098x16098        259145604        495446
interaction         rf      16098x16098        259145604        252047984
molecular_function  ct      17335x17335        300502225        300502225
molecular_function  el      17335x17335        300502225        300502225
molecular_function  ka      17335x17335        300502225        300502225
molecular_function  rf      17335x17335        300502225        300502225
pathway             ct      3828x3828          14653584         14432317
pathway             el      3828x3828          14653584         8641524
pathway             ka      3828x3828          14653584         163010
pathway             rf      3828x3828          14653584         8641524
phenotype           ct      5077x5077          25775929         25775929
phenotype           el      5077x5077          25775929         25775929
phenotype           ka      5077x5077          25775929         25775929
phenotype           rf      5077x5077          25775929         25775929
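
Comparing Table 2 with Table 4 shows a regular pattern for the ka kernel: in every network, the ka matrix has exactly n more non-zero entries than the corresponding similarity matrix, where n is the matrix dimension (for example, 262346911 - 262329914 = 16997 for biological_process). This is consistent with a transformation that leaves the off-diagonal entries untouched and only fills in the diagonal. The sketch below checks the pattern on the table values and illustrates it with a toy diagonal shift. The diagonal shift is an assumption made for illustration, since this report does not describe how ka is computed, and `shift_diagonal` is a hypothetical helper.

```python
# Values copied from Tables 2 and 4:
# network -> (dimension, similarity non-zeros, ka non-zeros)
TABLE_VALUES = {
    "biological_process": (16997, 262329914, 262346911),
    "cellular_component": (17963, 322651406, 322669369),
    "disease": (4168, 16584998, 16589166),
    "interaction": (16098, 479348, 495446),
    "molecular_function": (17335, 300484890, 300502225),
    "pathway": (3828, 159182, 163010),
    "phenotype": (5077, 25770852, 25775929),
}

def extra_non_zeros(values):
    """Per network, ka non-zeros minus similarity non-zeros."""
    return {net: ka - sim for net, (_, sim, ka) in values.items()}

def shift_diagonal(matrix, shift):
    """Hypothetical helper: add `shift` to every diagonal entry."""
    return [
        [value + (shift if i == j else 0.0) for j, value in enumerate(row)]
        for i, row in enumerate(matrix)
    ]

def count_non_zero(matrix):
    """Number of entries different from zero."""
    return sum(1 for row in matrix for value in row if value != 0)

differences = extra_non_zeros(TABLE_VALUES)
matches_dimension = all(
    differences[net] == n for net, (n, _, _) in TABLE_VALUES.items()
)

# Toy similarity matrix with an empty diagonal, as in the real data.
toy_sim = [
    [0.0, 0.8, 0.0],
    [0.8, 0.0, 0.5],
    [0.0, 0.5, 0.0],
]
toy_shifted = shift_diagonal(toy_sim, 1.0)
```

In the toy case the shifted matrix gains one non-zero entry per row, which mirrors the difference of n entries seen between the two tables.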

Table 5. Integrated kernel matrices

Integration                   Kernel  Matrix_Dimensions  Matrix_Elements  Matrix_Elements_Non_Zero
integration_mean_by_presence  ct      19460x19460        378691600        367014764
integration_mean_by_presence  el      19460x19460        378691600        365111844
integration_mean_by_presence  ka      19460x19460        378691600        350789298
integration_mean_by_presence  rf      19460x19460        378691600        365111844
mean                          ct      19460x19460        378691600        367014764
mean                          el      19460x19460        378691600        365111844
mean                          ka      19460x19460        378691600        350789298
mean                          rf      19460x19460        378691600        365111844
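
The integrated matrices are larger (19460x19460) than any single network matrix, and for each kernel type the two integration methods yield the same number of non-zero entries. Both observations fit an integration that places every network on the union of all genes and then averages entry by entry, with the two methods differing only in the divisor. The sketch below illustrates that reading. It is an assumption based on the method names, not a description of the actual implementation: here "mean" divides by the total number of networks, "mean by presence" divides by the number of networks that contain the gene pair, and the function `integrate` and the gene labels are hypothetical.

```python
def integrate(kernels, by_presence=False):
    """Average kernels defined on different gene sets over the union of genes.

    `kernels` is a list of (genes, matrix) pairs, where `matrix` is a list of
    rows ordered like `genes` (assumed layout).
    """
    union = sorted({gene for genes, _ in kernels for gene in genes})
    index = {gene: i for i, gene in enumerate(union)}
    n = len(union)
    total = [[0.0] * n for _ in range(n)]
    present = [[0] * n for _ in range(n)]
    for genes, matrix in kernels:
        for gene_a, row in zip(genes, matrix):
            for gene_b, value in zip(genes, row):
                total[index[gene_a]][index[gene_b]] += value
                present[index[gene_a]][index[gene_b]] += 1
    result = []
    for i in range(n):
        out_row = []
        for j in range(n):
            # Pairs absent from every network keep a zero entry.
            divisor = max(present[i][j], 1) if by_presence else len(kernels)
            out_row.append(total[i][j] / divisor)
        result.append(out_row)
    return union, result

def count_non_zero(matrix):
    """Number of entries different from zero."""
    return sum(1 for row in matrix for value in row if value != 0)

# Two toy kernels that share only gene "B".
toy_kernels = [
    (["A", "B"], [[1.0, 0.4], [0.4, 1.0]]),
    (["B", "C"], [[2.0, 0.6], [0.6, 2.0]]),
]
genes, mean_kernel = integrate(toy_kernels)
_, presence_kernel = integrate(toy_kernels, by_presence=True)
```

Under these assumptions the two results differ in value wherever a gene pair is missing from some network, but they are non-zero in exactly the same positions, as in Table 5.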

Weight values